AITopics | representation space


representation space



NegoCollab: A Common Representation Negotiation Approach for Heterogeneous Collaborative Perception

Neural Information Processing Systems | Jun-18-2026, 22:47:01 GMT

Collaborative perception improves task performance by expanding the perception range through information sharing among agents. Immutable heterogeneity poses a significant challenge in collaborative perception, as participating agents may employ different and fixed perception models. This leads to domain gaps in the intermediate features shared among agents, consequently degrading collaborative performance. Aligning the features of all agents to a common representation can eliminate domain gaps with low training cost. However, in existing methods, the common representation is designated as the representation of a specific agent, making it difficult for agents with significant domain discrepancies from this specific agent to achieve proper alignment.

artificial intelligence, machine learning, representation, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Transportation (0.46)
Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


The Dual Nature of Plasticity Loss in Deep Continual Learning: Dissection and Mitigation

Neural Information Processing Systems | Jun-17-2026, 21:37:11 GMT

Next to cell loss, loss of plasticity (LoP) is the primary cause of cognitive decline in normal aging brains. Recent works show that a similar LoP also plagues neural networks during deep continual learning (DCL). While it has been shown that random perturbations of learned weights can alleviate LoP, the underlying mechanisms of LoP remain insufficiently understood. Here we offer a view of LoP and dissect its mechanisms through the lens of a framework combining the theory of neural collapse and finite-time Lyapunov exponent (FTLE) analysis. We show that LoP actually consists of two contrasting types: (i) type-1 LoP is characterized by highly negative FTLEs, where the network is prevented from learning due to the collapse of representations; (ii) type-2 LoP is characterized by excessively positive FTLEs, where the network can train well but its increasingly chaotic behavior reduces its test accuracy. Based on these understandings, we introduce Generalized Mixup, designed to relax the representation space for prolonged DCL, and demonstrate its superior efficacy over existing methods.
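
For orientation, the sketch below shows standard mixup, the well-known augmentation that blends two training examples and their labels with a mixing weight. It is not the paper's Generalized Mixup, whose details the abstract does not give. The names `mixup_pair` and `sample_lam` and the default `alpha` are illustrative assumptions.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, lam):
    """Blend two examples and their one-hot labels with weight lam in [0, 1]."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


def sample_lam(alpha=0.2, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha), as in standard mixup.

    The default alpha is an arbitrary illustrative choice.
    """
    return rng.betavariate(alpha, alpha)


# Example: 25% of the first example, 75% of the second.
x, y = mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam=0.25)
```

In training, `lam` would typically be drawn once per batch or per pair with `sample_lam()`.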

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)


Encouraging metric-aware diversity in contrastive representation space

Neural Information Processing Systems | Jun-17-2026, 11:02:41 GMT

In cooperative Multi-Agent Reinforcement Learning (MARL), agents that share policy network parameters often learn similar behaviors, which hinders effective exploration and can lead to suboptimal cooperative policies. Recent advances have attempted to promote multi-agent diversity by leveraging the Wasserstein distance to increase policy differences. However, these methods cannot effectively encourage diverse policies, because the Wasserstein distance becomes ineffective when the policies are similar. To address this limitation, we propose Wasserstein Contrastive Diversity (WCD) exploration, a novel approach that promotes multi-agent diversity by maximizing the Wasserstein distance between the trajectory distributions of different agents in a latent representation space. To make the Wasserstein distance meaningful, we propose a novel next-step prediction method based on Contrastive Predictive Coding (CPC) to learn distinguishable trajectory representations. Additionally, we introduce an optimized kernel-based method to compute the Wasserstein distance more efficiently. Since the Wasserstein distance is inherently defined for two distributions, we extend it to support multiple agents, enabling diverse policy learning. Empirical evaluations across a variety of challenging multi-agent tasks demonstrate that WCD outperforms existing state-of-the-art methods, delivering superior performance and enhanced exploration.
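
As a small illustration of the quantity involved, the sketch below computes the Wasserstein-1 distance between two equal-sized one-dimensional samples, which reduces to the mean absolute difference of the sorted values. It is not the kernel-based latent-space method the abstract mentions. The function name and the toy samples are assumptions for illustration.

```python
def wasserstein_1d(samples_a, samples_b):
    """Wasserstein-1 distance between two equal-sized 1-D empirical samples."""
    if len(samples_a) != len(samples_b) or not samples_a:
        raise ValueError("samples must be non-empty and of equal length")
    # In one dimension the optimal transport plan matches sorted values.
    pairs = zip(sorted(samples_a), sorted(samples_b))
    return sum(abs(a - b) for a, b in pairs) / len(samples_a)


# Nearly identical samples give a distance close to zero;
# well-separated samples give a large one.
similar = wasserstein_1d([0.0, 1.0, 2.0], [0.0, 1.0, 2.1])
distinct = wasserstein_1d([0.0, 1.0, 2.0], [3.0, 4.0, 5.0])
```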

artificial intelligence, machine learning, wasserstein distance, (16 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.68)
Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment > Games > Computer Games (0.94)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)


DISCO: Disentangled Communication Steering for Large Language Models

Neural Information Processing Systems | Jun-17-2026, 08:12:19 GMT

In contrast, we propose to inject steering vectors directly into the query and value representation spaces within attention heads. We provide evidence that a greater portion of these spaces exhibit high linear discriminability of concepts (a key property motivating the use of steering vectors) than attention head outputs. We analytically characterize the effect of our method, which we term DISentangled COmmunication (DISCO) Steering, on attention head outputs. Our analysis reveals that DISCO disentangles a strong but underutilized baseline, steering attention head inputs, which implicitly modifies queries and values in a rigid manner. In contrast, DISCO's direct modulation of these components enables more granular control. We find that DISCO achieves superior performance over a number of steering vector baselines across multiple datasets on LLaMA 3.1 8B and Gemma 2 9B, with steering efficacy scoring up to 19.1% higher than the runner-up. Our results support the conclusion that the query and value spaces are powerful building blocks for steering vector methods. Our code is publicly available at https://github.com/MaxTorop/DISCO.
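
A minimal sketch of the idea of steering queries and values, assuming a single attention head, a single query position, and separate scalar strengths `alpha_q` and `alpha_v`. All names here are hypothetical and this is not the code from the linked repository. Because attention weights sum to one, adding the same vector to every value shifts the head output by exactly that vector, while steering the query changes the attention weights themselves.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def add_scaled(vec, direction, scale):
    """Return vec + scale * direction."""
    return [v + scale * d for v, d in zip(vec, direction)]


def steered_head_output(query, keys, values, q_steer, v_steer,
                        alpha_q=0.0, alpha_v=0.0):
    """One head, one query position; steering is added to the query and to every value."""
    q = add_scaled(query, q_steer, alpha_q)
    vs = [add_scaled(v, v_steer, alpha_v) for v in values]
    scale = 1.0 / math.sqrt(len(q))
    weights = softmax([dot(q, k) * scale for k in keys])
    dim = len(vs[0])
    return [sum(w * v[i] for w, v in zip(weights, vs)) for i in range(dim)]


# Toy inputs (hypothetical numbers).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
q_steer, v_steer = [0.0, 1.0], [0.5, -0.5]

base = steered_head_output(query, keys, values, q_steer, v_steer)
value_steered = steered_head_output(query, keys, values, q_steer, v_steer, alpha_v=2.0)
query_steered = steered_head_output(query, keys, values, q_steer, v_steer, alpha_q=2.0)
```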

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)


Long-Tailed Recognition via Information-Preservable Two-Stage Learning

Neural Information Processing Systems | Jun-16-2026, 20:24:52 GMT

Imbalance (a long tail) is inherent in many real-world data distributions and often biases deep classification models toward frequent classes, resulting in poor performance on tail classes. In this paper, we propose a novel two-stage learning approach to mitigate this majority-biased tendency while preserving valuable information within datasets. Specifically, the first stage proposes a new representation learning technique from an information-theoretic perspective. This approach is theoretically equivalent to minimizing intraclass distance, yielding an effective and well-separated feature space. The second stage develops a novel sampling strategy that selects mathematically informative instances, which can rectify majority-biased decision boundaries without compromising a model's overall performance. As a result, our approach achieves state-of-the-art performance across various long-tailed benchmark datasets.

accuracy, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


λ-Orthogonality Regularization for Compatible Representation Learning

Neural Information Processing Systems | Jun-15-2026, 20:18:33 GMT

Retrieval systems rely on representations learned by increasingly powerful models. However, due to the high training cost and inconsistencies in learned representations, there is significant interest in facilitating communication between representations and ensuring compatibility across independently trained neural networks. In the literature, two primary approaches are commonly used to adapt different learned representations: affine transformations, which adapt well to specific distributions but can significantly alter the original representation, and orthogonal transformations, which preserve the original structure with strict geometric constraints but limit adaptability. A key challenge is adapting the latent spaces of updated models to align with those of previous models on downstream distributions while preserving the newly learned representation spaces. In this paper, we impose a relaxed orthogonality constraint, namely λ-Orthogonality regularization, while learning an affine transformation, to obtain distribution-specific adaptation while retaining the original learned representations. Extensive experiments across various architectures and datasets validate our approach, demonstrating that it preserves the model's zero-shot performance and ensures compatibility across model updates.
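
To make the notion of a relaxed orthogonality constraint concrete, the sketch below adds a standard soft orthogonality penalty, the squared Frobenius norm of W^T W - I, to a task loss. This is a generic penalty and an assumption: the abstract does not state the exact form of the paper's regularizer or the precise role of λ, so `lam` here is simply a penalty weight. The function names are hypothetical.

```python
def orthogonality_gap(W):
    """Squared Frobenius norm of W^T W - I, with W given as a list of rows."""
    n_rows, n_cols = len(W), len(W[0])
    gap = 0.0
    for i in range(n_cols):
        for j in range(n_cols):
            gram_ij = sum(W[k][i] * W[k][j] for k in range(n_rows))
            target = 1.0 if i == j else 0.0
            gap += (gram_ij - target) ** 2
    return gap


def regularized_loss(task_loss, W, lam):
    """Task loss plus a soft orthogonality penalty weighted by lam (assumed form)."""
    return task_loss + lam * orthogonality_gap(W)


rotation = [[0.0, -1.0], [1.0, 0.0]]   # orthogonal: zero penalty
scaling = [[2.0, 0.0], [0.0, 2.0]]     # not orthogonal: penalized
loss = regularized_loss(1.0, scaling, lam=0.5)
```

With `lam = 0` the linear part of the affine map is unconstrained. Larger values push it toward an orthogonal transformation.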

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America > United States (0.28)
Asia (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)


Enhancing Time Series Forecasting through Selective Representation Spaces: A Patch Perspective

Neural Information Processing Systems | Jun-15-2026, 14:48:39 GMT

Time series forecasting has made significant progress with the help of the patching technique, which partitions a time series into multiple patches to effectively retain contextual semantic information in a representation space beneficial for modeling long-term dependencies. However, conventional patching partitions a time series into adjacent patches, which fixes the representation space and thus results in insufficiently expressive representations. In this paper, we pioneer the exploration of constructing a selective representation space to flexibly include the most informative patches for forecasting. Specifically, we propose the Selective Representation Space (SRS) module, which utilizes the learnable Selective Patching and Dynamic Reassembly techniques to adaptively select and shuffle the patches from the contextual time series, aiming to fully exploit the information in the contextual time series to enhance the forecasting performance of patch-based models. To demonstrate the effectiveness of the SRS module, we propose a simple yet effective SRSNet consisting of SRS and an MLP head, which achieves state-of-the-art performance on real-world datasets from multiple domains. Furthermore, as a novel plug-and-play module, SRS can also enhance the performance of existing patch-based models.
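
The contrast between conventional and selective patching can be sketched as follows. `make_patches` produces the usual adjacent (or strided) patches. `select_patches` then picks and reorders patches by index. In the paper the selection and reassembly are learnable, whereas here the indices are chosen by hand, and both function names are assumptions for illustration.

```python
def make_patches(series, patch_len, stride=None):
    """Split a series into fixed-length patches; stride defaults to patch_len (adjacent patches)."""
    if stride is None:
        stride = patch_len
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    # Trailing values that do not fill a whole patch are dropped.
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


def select_patches(patches, indices):
    """Pick and reorder patches by index (a hand-written stand-in for learned selection)."""
    return [patches[i] for i in indices]


series = list(range(8))
adjacent = make_patches(series, patch_len=4)               # conventional patching
candidates = make_patches(series, patch_len=4, stride=2)   # overlapping candidate patches
selected = select_patches(candidates, [2, 0])              # chosen and reordered
```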

data mining, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Energy (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)


Unsupervised Federated Graph Learning

Neural Information Processing Systems | Jun-13-2026, 05:58:31 GMT

Federated graph learning (FGL) is a privacy-preserving paradigm for modeling distributed graph data, designed to train a powerful global graph neural network. Existing FGL methods predominantly rely on label information during training; effective FGL in an unsupervised setting remains largely unexplored. In this paper, we address two key challenges in unsupervised FGL: 1) Local models tend to converge in divergent directions due to the lack of shared semantic information across clients. How to align representation spaces among multiple clients is therefore the first challenge.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


NegoCollab: A Common Representation Negotiation Approach for Heterogeneous Collaborative Perception

Neural Information Processing Systems | Jun-12-2026, 23:35:11 GMT

agent, artificial intelligence, representation, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.39)


Transferring Linear Features Across Language Models With Model Stitching

Neural Information Processing Systems | Jun-11-2026, 23:46:26 GMT

In this work, we demonstrate that affine mappings between residual streams of language models are a cheap way to effectively transfer represented features between models. We apply this technique to transfer the weights of Sparse Autoencoders (SAEs) between models of different sizes to compare their representations. We find that small and large models learn highly similar representation spaces, which motivates training expensive components like SAEs on a smaller model and transferring to a larger model at a FLOPs savings. For example, using a small-to-large transferred SAE as initialization can lead to 50% cheaper training runs when training SAEs on larger models. Next, we show that transferred probes and steering vectors can effectively recover ground truth performance. Finally, we dive deeper into feature-level transferability, finding that semantic and structural features transfer noticeably differently while specific classes of functional features have their roles faithfully mapped. Overall, our findings illustrate similarities and differences in the linear representation spaces of small and large models and demonstrate a method for improving the training efficiency of SAEs.
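
One consequence of using an affine map between residual streams can be shown in a few lines: an additive direction such as a steering vector passes through the linear part only, because the bias cancels. The sketch assumes a map y = A x + b from a small model's stream to a large model's stream with hand-picked toy values. In practice A and b would be fit on paired activations, and the abstract does not say how the paper itself transfers steering vectors. All names are hypothetical.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def affine_map(A, b, x):
    """Map an activation from the source stream to the target stream: A x + b."""
    return [y + c for y, c in zip(matvec(A, x), b)]


def transfer_direction(A, direction):
    """Map an additive direction; the bias b cancels, so only A applies."""
    return matvec(A, direction)


# Toy map from a 2-dimensional stream to a 3-dimensional one (hypothetical values).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.5, -0.5, 0.0]
h = [1.0, 2.0]        # source activation
steer = [0.5, 0.0]    # source steering vector

mapped = affine_map(A, b, h)
mapped_steered = affine_map(A, b, [x + s for x, s in zip(h, steer)])
transferred = transfer_direction(A, steer)
```

Steering in the source stream and then mapping gives the same result as mapping first and adding `transferred` in the target stream.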

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)
